(57) Abstract

In a digital broadcasting system (DAB), audio is transmitted as a continuous stream in audio frames. According to the invention,
audio is also transmitted in packet mode by placing exactly one audio frame in the data field of each data group. Audio can be
transferred as a file, in which case a given number of successive audio frames make up an audio file and the audio file is transferred
in accordance with the file transfer protocol of the system. In this case, the transfer speed may vary. Alternatively, audio can be
transferred in packet mode as a continuous stream, in which case successive data groups are formed from audio frames arriving as a
stream and the transfer speed of the data packets is selected so that the transfer speed of an audio frame assigned to a data group
is the same as that of an audio frame transmitted as a continuous stream.
TRANSPORT OF AUDIO IN A DIGITAL BROADCASTING SYSTEM

The present invention relates to audio transport in
a digital broadcasting system in which services can be
transported as a continuous stream or in data packets.

The Digital Audio Broadcasting (DAB) system, which
has been developed to allow an efficient utilization of
frequency bands, uses a completely digital transmission
path. The system is designed to replace the analogue
broadcasting system commonly used at present, which is
based on the use of frequency modulation. DAB defines a
digital radio channel based on multiple carriers, which
is applicable for the transmission of both audio and
data services. A completely digital transmission channel
may be either a continuous data stream channel or a
packet channel. Packet transmission is more flexible and
allows easier transmission of data units of finite
length. The DAB system is presented in ETSI (European
Telecommunication Standards Institute) standard 300 401,
February 1995.

From the user's point of view, the highest level of
abstraction in the DAB system is called an ensemble,
Fig. 1. It contains all services existing in a given
frequency band. A change from one ensemble to another is
effected by tuning to a different frequency band, just
as one changes channels in current FM radio reception.
The ensemble is divided into services, exemplified in
Fig. 1 by Alpha Radio 1, Beta Radio and Alpha Radio 2.
In addition, there may be data services, although they
are not shown in the figure. Each service is further
divided into service components. A service component can
be transported either via an audio channel or via a data
channel. For comparison, FM radio contains only one
service and one service component (audio) in each
channel. At the lowest level, the transmission frame,
whose duration is 24 ms or 96 ms depending on the DAB
mode, consists of three chronologically consecutive
parts. The first part is a synchronizing channel, which
contains no service information. The next part is a fast
information channel FIC, which has a mode-specific fixed
length. The last part is a main service channel MSC,
which contains all the subchannels. The position, size
and number of subchannels within the MSC may vary, but
the size of the MSC is constant. The MSC contains a
maximum of 63 different audio and/or data subchannels.
The subchannels are numbered on the basis of a so-called
sub-Channel Id from 0 to 62. Moreover, the MSC may
contain an auxiliary information channel AIC, which has
a fixed channel number 63. The AIC may contain the same
type of information as the FIC. The MSC information is
transmitted using time interleaving such that in DAB
mode I the MSC part of the frame is divided into four
parts and these are placed in successive transmission
frames. These parts are known as CIFs (Common
Interleaved Frames), so the MSC part of the frame in
mode I contains four CIFs. In other modes no
interleaving is used, so the MSC is the same as the CIF.

At the transmitting end, besides audio services,
the service supplier may also provide data services and
e.g. multimedia services. From the audio information and
data supplied by the service suppliers, the DAB operator
produces a DAB transmission signal, which consists of
successive transmission frames such as those presented
in the lower part of Fig. 1.

In the receiver, the information channel FIC and
the channel MSC containing the audio and data services
are separated from each other from the transmission
frame. The subchannels are separated and channel decoded
and then passed on for further processing. From the
information in the received FIC channel, the user will
know which services are included in the ensemble
received and is thus able to select a desired service or
services. By combining subchannel service components
according to the application programme, it is possible to
compose e.g. a desired multimedia service.

One advantage of the DAB system is that data
capacity can be reserved for different service suppliers
on a dynamic basis. The capacity may be 1.728 Mbit/s at
most. The data is transmitted in packets as illustrated
by Fig. 2, each consisting of a header field, a data
field and a checksum. The meanings of the fields are in
accordance with the DAB standard. The packet header
contains packet length data (Pkt Len), a continuity index
(Cont Ind), first/last packet data (First/Last), an
address (Pkt Address) identifying the service component,
a command (Command) and the actual data field length
(Data Len). The data field contains the actual data to be
transmitted plus fill bits if required. At the end is
the packet checksum (Pkt CRC).

By combining the data fields of packets in the
receiver, a so-called data group is formed, Fig. 2b. The
packets are formed at the transmitting end from the data
group by simply cutting it into sections and placing
each section into the data field of a data packet.
Generally a data group consists of the data fields of a
number of consecutive packets transmitted. In the
simplest case, one packet is sufficient to form a data
group.
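
The cutting and recombining described above can be sketched
as follows. This is a minimal illustration only: the Packet
class, the function names, the maximum data field length
parameter and the wrap-around of the continuity index at 4
are assumptions made for the sketch and are not taken from
the text or the figures.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    """Simplified packet: only the header fields needed for reassembly."""
    first: bool     # First/Last information: first packet of the data group
    last: bool      # First/Last information: last packet of the data group
    cont_ind: int   # continuity index (modulus 4 is an assumption)
    address: int    # Pkt Address identifying the service component
    data: bytes     # useful part of the data field (Data Len == len(data))


def split_data_group(data_group: bytes, address: int,
                     max_data_len: int, start_cont: int = 0) -> List[Packet]:
    """Cut a data group into sections, one section per packet data field."""
    if max_data_len <= 0:
        raise ValueError("max_data_len must be positive")
    chunks = [data_group[i:i + max_data_len]
              for i in range(0, len(data_group), max_data_len)] or [b""]
    return [Packet(first=(i == 0),
                   last=(i == len(chunks) - 1),
                   cont_ind=(start_cont + i) % 4,
                   address=address,
                   data=chunk)
            for i, chunk in enumerate(chunks)]


def join_packets(packets: List[Packet]) -> bytes:
    """Rebuild the data group from the data fields of consecutive packets."""
    if not packets or not packets[0].first or not packets[-1].last:
        raise ValueError("incomplete data group")
    return b"".join(p.data for p in packets)
```

A data group that fits into one data field yields a single
packet that is marked both first and last, which corresponds
to the simplest case mentioned above.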

The data group is formed as illustrated by Fig. 3. 
The meanings of the abbreviations of the data group 
header and session header fields are as indicated in the 
table below: 



WO 97/17776 






Data Group header

EXT FL      extension flag
CRC FL      CRC flag
SES FL      session flags
DG TYPE     data group type
CONT IND    continuity index
REP IND     repetition index
EXT FIELD   extension field

Session header

LAST FL     last
SEG NUM     segment number
RFA         reserved for future applications
LEN IND     length of next address field
ADDR FIELD  end user's address



These header fields are followed by the actual
data and the data group checksum DG CRC.

A continuous audio stream is transmitted in frames
having a structure as illustrated by Fig. 4. At the
transmitting end, 16-bit PCM coded audio samples coming
at a frequency of 48 kHz are divided into sub-bands and
the samples of the sub-bands are encoded into the audio
frame by making use of the masking effect of the human
ear so that the incoming bit rate of 768 kbit/s is
reduced, e.g. in the case of a mono channel, to a rate of
about 100 kbit/s. The four-byte header of the frame
contains information intended for the decoders in the
receiver, such as synchronization data, bit rate data,
and sampling frequency data. A bit allocation field
coming after the checksum indicates how the bits are
allocated to each one of the audio field sub-bands
containing 36 coded samples and which bits have been
removed from the samples in making use of the masking
effect. A Scale Factor Selection Information field
indicates how the group of audio samples has been scaled
(normalized) in the decoder. After this there is a field
that contains the audio bits proper. The information in
it corresponds to 24 ms of audio. The field contains 36
encoded sub-band audio samples divided into twelve
triplets, each of which contains 3 sub-band samples.
Thus, four triplets correspond to 12 ms of audio. Next
there are fill bits if the number of audio bits amounts
to less than the audio field length. Finally there are
an X-PAD and an F-PAD field, which transmit Programme
Associated Data (PAD). This data is in synchronism with
the audio of the frame. The PAD bytes of successive
frames make up a so-called PAD channel.
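
The figures quoted above can be checked with a short
calculation. The sketch below derives the incoming PCM bit
rate from the sample width and sampling frequency, and the
size of one 24 ms audio frame for a given coded bit rate.
The function names are illustrative, and 96 kbit/s is used
in the usage note only as an example of a coded rate; it is
not a value stated in the text.

```python
SAMPLE_RATE_HZ = 48_000      # sampling frequency given in the text
BITS_PER_SAMPLE = 16         # PCM sample width given in the text
FRAME_DURATION_MS = 24       # audio carried by one audio frame


def pcm_bit_rate(channels: int = 1) -> int:
    """Bit rate of the uncompressed PCM input in bit/s."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE * channels


def pcm_samples_per_frame() -> int:
    """Number of PCM samples per channel covered by one audio frame."""
    return SAMPLE_RATE_HZ * FRAME_DURATION_MS // 1000


def frame_size_bytes(coded_bit_rate: int) -> int:
    """Size of one audio frame in bytes at the given coded bit rate (bit/s)."""
    bits_times_1000 = coded_bit_rate * FRAME_DURATION_MS
    if bits_times_1000 % 8000:
        raise ValueError("bit rate does not give a whole number of bytes")
    return bits_times_1000 // 8000
```

With these definitions, pcm_bit_rate() gives 768000 bit/s
for a mono channel, in agreement with the text, and
frame_size_bytes(96_000) gives 288 bytes per frame.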

The audio part of multimedia is proposed to be 
transmitted in audio frames, yet there may be certain 
reasons to transmit audio in packet mode as well. Packet 
mode transmission could in principle be applied e.g. to 
send audio files, which would first be stored in the 
memory of the receiver and then played back via the 
speakers at the appropriate time during the multimedia 
presentation. This type of audio transport has the ad- 
vantage that the file can be transmitted at any bit 
rate, so the transport rate need not be the same as the 
fixed bit rate which is used in the transmission of 
audio frames or an audio stream and which has been as- 
signed a number of allowed bit rate values in the DAB 
specification. 

A disadvantage with this type of transmission is
that the audio file has to be stored in memory if the
transport rate is lower than the audio stream bit rate,
or it needs to be buffered if the transport rate is
higher than the audio stream bit rate. In the former
case, playback cannot be started immediately upon
reception of the file, whereas in the case of buffering,
playback can be started immediately. However, the
disadvantage is not a real one, because in most of the
possible practical applications there is no need to
transmit an audio file at the audio stream bit rate. The
real problem is that the audio stream in audio frames
cannot be transmitted in packet mode because there is no
mechanism for indicating the audio frame boundaries.

However, there are certain applications in which
it is desirable to implement audio transfer in real time
packet mode. Real time packet transmission here means
that the bit rate is the same in the transmission of
both the audio stream and the audio packets, and that it
would be possible in some way to extract from the
packets the same information that is contained in the
audio frames. This principle could be applied e.g. to
locate bit errors in audio frames by comparing the
received audio frame with the checksum CRC of the
received packets that carry the same audio information
and with the checksum CRC of the data group formed from
the packets. In this way, the audio samples in the audio
frames would end up being covered by CRC checking. Using
packets for this purpose would involve extra auxiliary
signalling of audio, which might be acceptable as a
temporary solution. If audio transmission is to be
protected on a lasting basis, and as efficiently as
packet transmission is, it would of course be preferable
to improve the error protection of the audio bits in
audio frames directly.

Another possible application of real time transfer
of audio packets is found in the case where audio is to
be addressed to a given user group only. This
application makes use of the fact that the packets
contain an address. As shown in the table above, the
session header contains a field reserved for the end
user's address, ADDR FIELD. This could be utilized to
direct audio information to a given group, allowing real
time packet transmission to be used as an information
channel for different user groups. The DAB specification
also defines the transmission of announcements via the
fast information channel FIC. The definition describes
the interruption of an active broadcast-type service by
an announcement, but the transmission profile for the
announcement can only be defined by specifying the
services to be interrupted. There is no way to address
the announcement to a given user group only.
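
The addressing idea can be pictured with the following
sketch, in which a receiver keeps only the audio frames of
data groups whose address field matches one of the user
groups it belongs to. The representation of a data group as
an (address, audio frame) pair and the rule that a data
group without an address is meant for every receiver are
assumptions made for this illustration.

```python
from typing import FrozenSet, Iterable, Iterator, Optional, Tuple

# (ADDR FIELD contents or None when no address is present, audio frame)
AddressedGroup = Tuple[Optional[bytes], bytes]


def addressed_frames(data_groups: Iterable[AddressedGroup],
                     my_groups: FrozenSet[bytes]) -> Iterator[bytes]:
    """Yield the audio frames intended for this receiver."""
    for addr, frame in data_groups:
        # Assumption: a data group without an address is for everybody.
        if addr is None or addr in my_groups:
            yield frame
```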

Therefore, the object of the present invention is
to achieve audio transport in packet mode so as to allow
both addressed audio transport and indication of audio
frames in a packet stream.

This object is achieved in the manner described in 
the independent claim. 

According to the first embodiment of the
invention, audio is transmitted over a packet channel as
a file. An audio frame is placed in the data field of a
data group. Thus, one segment in the file transfer
protocol corresponds to one audio frame. From the packets
received through the channel, a data group is assembled
in the normal manner, so the data field of the data
group will contain a file segment, which in this case is
an audio frame. To enable the audio frames obtained from
the data field to be arranged in the correct
chronological order in the receiver, a file transfer
protocol needs to be used. Since the transfer can be
performed at any speed, the frames have to be stored or
buffered in the receiver prior to presentation.

According to the second embodiment, the audio
frames are transmitted over a packet channel in the form
of a continuous stream. An audio frame is placed in the
data field of a data group. The transmission speed is
exactly the same as the audio channel bit rate. In this
case, individual data groups are transmitted at exactly
the correct pace, so the transmission consists of an
endless train of data groups. Therefore, no file
transfer protocol is needed. No buffering needs to be
used; instead, the receiver presents the audio frames
directly as it receives data groups from the packet
channel.

The invention is illustrated by the attached
drawings, in which

Fig. 1 presents a known DAB hierarchy,

Fig. 2a presents the structure of DAB packets,

Fig. 2b shows how a data group is formed from packets,

Fig. 3 presents the structure of a data group,

Fig. 4 presents a DAB audio frame, and

Fig. 5 illustrates the use of an IDG.
According to the invention, to allow the audio
frame boundaries to be clearly indicated when audio
information is transmitted in packet mode, one and only
one audio frame in its entirety, including the PAD part,
is assigned to each data group.

According to the first embodiment, the audio
frames are transmitted as an audio file, in other words,
the audio, which has a beginning, a duration and an end,
constitutes one file. In accordance with the file
transfer principle used in DAB, the file is divided into
segments and each segment is placed in the data field of
a data group. The protocol will be briefly described
below. According to the invention, a segment is exactly
equivalent to an audio frame. Since the audio is
transmitted as a file, it is necessary to use a file
transfer protocol to enable the receiver to arrange the
data groups assembled from the packets into the correct
sequence. The file transfer speed may be higher than, the
same as or lower than the audio stream bit rate.
The file transfer protocol can be implemented
according to the general basic principle proposed by the
Eureka-147 project, in which each segment forms one data
group. Successive segments of the file are numbered
sequentially so that the first segment number in the
session header is 0. The last segment of the file is
indicated by a flag in the LAST field of the session
header of the data group formed from it. The receiver
receives the data packets and forms data groups out of
them. If the checksum of a data group indicates that bit
errors occurred during the transfer, the receiver picks
up the data packets of the relevant data group from the
retransmission of the file.
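
The numbering and retransmission principle described above
can be sketched as follows. This is an illustration under
stated assumptions, not the Eureka-147 protocol itself: the
DataGroup class, the FileReceiver class and the crc_ok flag
(standing for the result of a DG CRC check carried out
elsewhere) are hypothetical names introduced for the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DataGroup:
    seg_num: int          # SEG NUM of the session header, first segment is 0
    last: bool            # flag in the LAST field, set on the final segment
    frame: bytes          # data field: exactly one audio frame
    crc_ok: bool = True   # outcome of the DG CRC check (computed elsewhere)


def file_to_data_groups(frames: List[bytes]) -> List[DataGroup]:
    """Sender side: one data group per audio frame, numbered from 0."""
    n = len(frames)
    return [DataGroup(i, i == n - 1, f) for i, f in enumerate(frames)]


class FileReceiver:
    """Receiver side: collect segments until the whole file is present."""

    def __init__(self) -> None:
        self._segments: Dict[int, bytes] = {}
        self._last: Optional[int] = None

    def accept(self, dg: DataGroup) -> None:
        if not dg.crc_ok:
            return  # defective: pick this segment up from a retransmission
        self._segments.setdefault(dg.seg_num, dg.frame)
        if dg.last:
            self._last = dg.seg_num

    def complete(self) -> bool:
        return (self._last is not None and
                all(i in self._segments for i in range(self._last + 1)))

    def frames(self) -> List[bytes]:
        """Audio frames in chronological order, once the file is complete."""
        if not self.complete():
            raise ValueError("file not yet complete")
        return [self._segments[i] for i in range(self._last + 1)]
```

Because the segments are keyed by their number, the receiver
restores the chronological order of the audio frames
regardless of the order in which valid data groups arrive.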

To enable the receiver to pick up the correct
files from the packet stream transmitted and identify
the file type in question so as to ensure proper file
management, the Eureka-147 project includes a
proposition that an additional special Information Data
Group (IDG) be created. This is a file transfer
descriptor, i.e. it provides the necessary information
about the file it refers to and it is multiplexed with
the file segments.

Fig. 5 illustrates the idea of the IDG in file
transfer. One IDG is associated with only one file. It
is placed at least at the beginning of the packet stream
relating to the file, i.e. at the beginning of the file
transfer, but IDGs can also be placed in the middle of
the packet stream, in other words, the IDG may appear at
any point during the file transfer. The IDG can also be
transmitted some time before the actual file transfer,
so it can be used to announce a file transfer before it
is started. In Fig. 5, files X, Y and Z are transferred
and the IDGs referring to them are identified by
corresponding letters. The important thing that can be
accomplished by means of the IDG is contained in its data
field, which is called the 'file descriptor'. The file
descriptor can be used to give the receiver the required
detailed information about the file to be transferred.
The file descriptor includes a field containing
so-called transfer parameters (T-parameters). A
T-parameter called File Type declares the type of the
file, so the application software of the receiver is
able to decide which algorithm to use for file analysis
and interpretation of its contents.

The applicant proposes that a new file type
parameter called "DAB audio" be added to let the receiver
know that the incoming file announced by the IDG is an
audio file.

According to the DAB specification, the
information about the services is transmitted over the
fast information channel (FIC). It is used to announce
the location and nature of the service components. A
service description for each service is placed in a
separate field, which contains a parameter field called
service component description. One of the parameters is
service component type and this parameter is set to File
Transfer.

From the FIC channel information the receiver thus
learns that a file is being transferred and from the IDG
information that the file is an audio file. The receiver
is therefore able to receive the file, decode the audio
and present it. Depending on the file transfer speed,
the receiver must either store the audio file or use
buffering for it.
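
The decision made by the receiver in the first embodiment
can be summarised by the sketch below. The function name
and the returned strings are illustrative only. The rule
that a transfer rate equal to the audio stream bit rate is
handled by buffering is an assumption, since the text
distinguishes only the lower and the higher case.

```python
from typing import Optional

FILE_TRANSFER = "File Transfer"   # service component type signalled in the FIC
DAB_AUDIO = "DAB audio"           # proposed File Type value carried in the IDG


def reception_plan(service_component_type: str,
                   idg_file_type: Optional[str],
                   transfer_rate: int,
                   audio_rate: int) -> str:
    """Decide how to handle a packet mode service component.

    Rates are in bit/s. idg_file_type is None until an IDG has been received.
    """
    if service_component_type != FILE_TRANSFER:
        return "not a file transfer"
    if idg_file_type is None:
        return "wait for IDG"
    if idg_file_type != DAB_AUDIO:
        return "other file type"
    if transfer_rate < audio_rate:
        # Slower than the audio stream: store the file, play it back later.
        return "store, then play"
    # Assumption: equal or higher rate is handled by buffering.
    return "buffer and play"
```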

PCT/FI96/00S9S

According to the second embodiment, the audio
frames are transmitted in packet mode, yet as a
continuous audio stream. The transfer speed is exactly
the same as it would be if audio frames were used. At the
transmitting end, as audio frames are coming as a
continuous stream, each audio frame is first assigned to
a data group, then packets are formed from the data group
and transmitted. The audio frame also contains the PAD.
Thus, the transmission is not a file transfer procedure
as in the first embodiment, so no file transfer protocol
is needed. Therefore, no information data groups IDG are
needed, either. An audio data group may never cross the
CIF boundary, so the entire data group must be
transferred in a single common interleaved frame CIF. In
other words, the data group must be transferred in a
single transmission frame. If repetition is used at the
data group level, then the data group and all the
repetitions of it must be transmitted in a single CIF,
i.e. each repetition is transmitted in a single CIF.
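
Whether a data group carrying one audio frame fits into a
single CIF can be estimated as sketched below. The sketch
assumes that one CIF carries 24 ms worth of the subchannel
bit rate and that all packets have the same length; the
packet length and the per-packet overhead (header plus
checksum) are parameters, and the values used in the usage
note are assumptions for illustration, not figures taken
from the text.

```python
import math

CIF_DURATION_MS = 24  # assumption: one CIF carries 24 ms of each subchannel


def packets_needed(data_group_len: int, data_field_len: int) -> int:
    """Number of packets required to carry one data group."""
    if data_field_len <= 0:
        raise ValueError("data field length must be positive")
    return max(1, math.ceil(data_group_len / data_field_len))


def fits_in_one_cif(data_group_len: int, packet_len: int,
                    packet_overhead: int, subchannel_bit_rate: int) -> bool:
    """True if all packets of the data group fit into one CIF.

    Lengths are in bytes, the subchannel bit rate in bit/s.
    """
    data_field_len = packet_len - packet_overhead
    cif_bytes = subchannel_bit_rate * CIF_DURATION_MS // 8000
    total = packets_needed(data_group_len, data_field_len) * packet_len
    return total <= cif_bytes
```

For example, a 300-byte data group sent in 96-byte packets
with 5 bytes of overhead each needs four packets, i.e. 384
bytes, which is exactly the capacity that a 128 kbit/s
subchannel offers per CIF under the above assumption. This
shows that the packet channel must run somewhat faster than
the coded audio bit rate in order to deliver one complete
audio frame in every CIF.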

When a data group is formed, a session header is
not necessarily needed, but it is advisable to use it
because it contains an address field ADDR FIELD, which
is intended for the end user's address. By means of this
address, an audio transmission can be addressed to a
desired user group. If the session header is used, then
the SEG NUM field is used as a counter that is
incremented by one on every data group. This helps the
receiver keep in synchronism, because if a data group
fails to be transmitted or is defective, the playback is
delayed by the duration of an audio frame. The flag in
the LAST FL field is always zero because the
transmission is a continuous audio stream, albeit in
packet mode.
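
The use of SEG NUM as a running counter can be sketched as
follows. The sketch counts how many data groups were lost
between two received ones and fills each gap with a
placeholder frame so that playback timing is preserved. The
counter modulus (the counter must wrap around at some
value, which is not stated in the text) and the placeholder
frame are assumptions made for this illustration.

```python
from typing import Iterable, List, Tuple


def lost_data_groups(prev_seg: int, seg: int, modulus: int) -> int:
    """Number of data groups missing between two received SEG NUM values."""
    return (seg - prev_seg - 1) % modulus


def playout(received: Iterable[Tuple[int, bytes]], modulus: int,
            placeholder: bytes) -> List[bytes]:
    """Return frames in playout order, one placeholder per lost data group.

    received holds (SEG NUM, audio frame) pairs of valid data groups.
    """
    out: List[bytes] = []
    prev = None
    for seg, frame in received:
        if prev is not None:
            out.extend([placeholder] * lost_data_groups(prev, seg, modulus))
        out.append(frame)
        prev = seg
    return out
```

Each placeholder stands for the duration of one audio
frame, so the frames that follow a gap are presented at
their proper time.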

The data group can be interleaved with other
packet mode service components in the same subchannel,
but the data group must still fit into a single CIF. If
a data group is interleaved with other service
components carrying audio data groups in the same
subchannel, then all the data groups must be transmitted
in a single CIF.

As in the case of the first embodiment, the
applicant proposes that the service component type
parameter "DAB Audio Stream" be sent in the service
component description field in the fast information
channel FIC.

It is obvious to a person skilled in the art that
technological development permits many different ways of
implementing the basic idea of the invention. The
invention and its embodiments are thus not limited to the
examples described above but may be varied within the
framework of the claims.
Claims

1. Procedure for the transfer of audio in a
digital broadcasting system, in which an audio programme
transmitted as a continuous stream is transferred in the
form of audio frames and in which, in packet mode, the
information to be transmitted is assigned to the data
field of a data group (DG) and the data group is divided
into segments which are placed in the data fields of the
data packets,

characterized in that the audio is transmitted in
packet mode by placing the audio frame in the data field
of the data group.

2. Procedure as defined in claim 1, characterized
in that a certain number of successive audio frames make
up an audio file and the audio file is transferred in
accordance with the communication protocol of the
system.

3. Procedure as defined in claim 2, characterized
in that the transfer rate of the audio frames can be
freely selected within the limits of the packet transfer
speed of the system.

4. Procedure as defined in claim 1, characterized
in that successive data packets are formed from
successive audio frames and that the transfer speed of
the data packets transferred as a continuous train is so
selected that the transfer speed of an audio frame placed
in the data group is the same as the transfer speed of
an audio frame transmitted as a continuous stream.

5. Procedure as defined in claim 4, characterized
in that the entire data group is transmitted in a single
transmission frame, in which case the data group is not
time-interleaved into several transmission frames.
6. Procedure as defined in claim 1, characterized
in that a segment number field in the data group header
is used as a counter which is incremented by one each
time a data group is transmitted.

7. Procedure as defined in claim 1, characterized
in that an address field in the data group header is
used to address a given user group.

8. Procedure as defined in claim 1, characterized
in that the receiver is informed via mechanisms
consistent with the system that the data field of the
data group contains audio information.

9. Procedure as defined in claim 1, characterized 
in that the digital broadcasting system is DAB. 

10. Procedure as defined in claim 9, characterized
in that the information that the data field of the data
group contains audio information is conveyed in the File
Type field of the Information Data Group (IDG).

11. Procedure as defined in claim 9, characterized
in that the information that the audio is transferred as
a file is conveyed via the Fast Information Channel
(FIC) by setting the service component type parameter to
File Transfer.

12. Procedure as defined in claim 9, characterized
in that the information that the audio is transferred as
an audio stream is conveyed via the Fast Information
Channel (FIC) by setting the service component type
parameter to DAB Audio Stream.